Phonetic variability constrained bottleneck features for joint speaker recognition and physical task stress detection
Similar resources
Bottleneck features for speaker recognition
Bottleneck neural networks have recently found success in a variety of speech recognition tasks. This paper presents an approach in which they are used in the front end of a speaker recognition system. The network inputs are mel-frequency cepstral coefficients (MFCCs) from multiple consecutive frames, and the outputs are speaker labels. We propose using a recording-level criterion that is opt...
Analysis and Optimization of Bottleneck Features for Speaker Recognition
Recently, Deep Neural Network (DNN) based bottleneck features have proved very effective in i-vector based speaker recognition. However, bottleneck feature extraction is usually fully optimized for speech recognition rather than for the speaker recognition task. In this paper, we explore whether DNNs that are suboptimal for speech recognition can provide better bottleneck features for speaker recognition. We experimen...
Compensation for phonetic nuisance variability in speaker recognition using DNNs
In this paper, a new way of using a phonetic DNN in text-independent speaker recognition is examined. Inspired by the Subspace GMM approach to speech recognition, we try to extract i-vectors that are invariant to the phonetic content of the utterance. We overcome the assumption of Gaussian-distributed senones by combining DNN posteriors with UBM posteriors, and we form a complete EM algorithm for training and...
Phonetic Refraction for Speaker Recognition
This paper describes a newly realized high-performance speaker recognition system and examines methods for its improvement. Innovative experiments early this year showed that phone strings are exceptional features for speaker recognition. The original system produced equal error rates of less than 11.5% on Switchboard-I audio files. Subsequent research indicates that the equal error rate can be nea...
Automatic recognition of speaker physical load using posterior probability based features from acoustic and phonetic tokens
This paper presents an automatic approach to recognizing speaker physical load using posterior-probability-based features from acoustic and phonetic tokens. In this method, the tokens used to calculate the posterior probability, or zero-order statistics, are extended from the conventional MFCC-trained Gaussian Mixture Model (GMM) components to parallel phonetic phonemes and tandem-feature-trained GMM...
Journal
Journal title: The Journal of the Acoustical Society of America
Year: 2020
ISSN: 0001-4966
DOI: 10.1121/10.0002455